DNA Sequencing via Quantum Mechanics and Machine Learning
Authors
Abstract
Rapid sequencing of individual human genomes is a prerequisite for genomic medicine, in which diseases will be prevented by preemptive cures. Quantum-mechanical tunneling through single-stranded DNA in a solid-state nanopore has been proposed for rapid DNA sequencing, but the tunneling current alone cannot distinguish the four nucleotides because of large fluctuations in molecular conformation and solvent. Here, we propose a machine-learning approach applied to the tunneling current-voltage (I-V) characteristic for efficient discrimination between the four nucleotides. We first combine principal component analysis (PCA) and fuzzy c-means (FCM) clustering to learn the "fingerprints" of the electronic density of states (DOS) of the four nucleotides, which can be derived from the I-V data. We then apply a hidden Markov model and the Viterbi algorithm to sequence a time series of DOS data (i.e., to solve the sequencing problem). Numerical experiments show that the PCA-FCM approach can classify unlabeled DOS data with 91% accuracy. Furthermore, the classification is robust against moderate levels of noise: 70% accuracy is retained at a signal-to-noise ratio of −26 dB. The PCA-FCM-Viterbi approach provides a fourfold increase in accuracy for the sequencing problem compared with PCA alone. In conjunction with recent developments in nanotechnology, this machine-learning method may pave the way to the much-awaited rapid, low-cost genome sequencer.
Corresponding author. Fax: +1 (213) 821-2664, Email: [email protected]. International Journal of Computational Science 1992-6669 © Global Information Publisher 20xx, Vol. x, No. x, xx-xx
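The decoding step named in the abstract (a hidden Markov model solved with the Viterbi algorithm) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each time sample already carries a soft membership over the four nucleotides (for instance from an FCM-style classifier), that these memberships can stand in for emission likelihoods, and that a base tends to persist across consecutive samples (the hypothetical `stay` probability). The names `viterbi`, `memberships`, and `stay`, and all numbers, are illustrative assumptions.

```python
import numpy as np

NUCLEOTIDES = "ACGT"

def viterbi(log_emit, log_trans, log_init):
    """Most likely hidden-state path of an HMM, computed in log space.

    log_emit:  (T, S) log-likelihood of each observation under each state
    log_trans: (S, S) log transition probabilities (row = from, column = to)
    log_init:  (S,)   log initial-state probabilities
    """
    T, S = log_emit.shape
    score = np.empty((T, S))            # best log-score of any path ending in state s at time t
    back = np.zeros((T, S), dtype=int)  # predecessor state on that best path
    score[0] = log_init + log_emit[0]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_trans  # cand[i, j]: arrive in j from i
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(S)] + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy soft memberships per time sample (columns: A, C, G, T). Assumed values.
memberships = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],   # noisy sample: the per-sample argmax picks C here
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.6, 0.2],
])

stay = 0.8                                # assumed probability of staying on the same base
trans = np.full((4, 4), (1.0 - stay) / 3)
np.fill_diagonal(trans, stay)
init = np.full(4, 0.25)                   # uniform prior over the first base

path = viterbi(np.log(memberships), np.log(trans), np.log(init))
decoded = "".join(NUCLEOTIDES[s] for s in path)
framewise = "".join(NUCLEOTIDES[s] for s in memberships.argmax(axis=1))
print(framewise)  # AACAGG
print(decoded)    # AAAAGG
```

In this toy case the per-sample decision mislabels the noisy third sample, while the Viterbi path uses the temporal context to smooth it out. How much this helps on real DOS data depends on the actual emission and transition models, which this sketch does not reproduce.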
Similar resources
Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means
One of a data miner's most important concerns is always to have accurate, error-free data: data that contains no human errors and whose records are complete and correct. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...
Timing, Sequencing, and Quantum of Life Course Events: A Machine Learning Approach
In this paper we discuss and apply machine learning techniques, using ideas from a core research area in the artificial intelligence literature to analyse simultaneously timing, sequencing, and quantum of life course events from a comparative perspective. We outline the need for techniques which allow the adoption of a holistic approach to life course analysis, illustrating the specific case of...
Timing, Sequencing and Quantum of Life Course Events: a Machine Learning Approach
In this methodological paper we discuss and apply machine learning techniques, a core research area in the artificial intelligence literature, to analyse simultaneously timing, sequencing, and quantum of life course events from a comparative perspective. We outline the need for techniques which allow the adoption of a holistic approach to the analysis of life courses, illustrating the specific ...
Dynamic Network Analysis
Dynamic network analysis (DNA) varies from traditional social network analysis in that it can handle large dynamic multi-mode, multi-link networks with varying levels of uncertainty. DNA, like quantum mechanics, would be a theory in which relations are probabilistic, the measurement of a node changes its properties, movement in one part of the system propagates through the system, and so on. Ho...
Quantum adiabatic machine learning
We develop an approach to machine learning and anomaly detection via quantum adiabatic evolution. In the training phase we identify an optimal set of weak classifiers, to form a single strong classifier. In the testing phase we adiabatically evolve one or more strong classifiers on a superposition of inputs in order to find certain anomalous elements in the classification space. Both the traini...
Journal title:
- CoRR
Volume abs/1012.0900, Issue
Pages -
Publication date 2010